The optimization and shock waves in evolution dynamics 
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We consider the optimal dynamics in the infinite population evolution models with general sym- 
metric fitness landscape. The search of optimal evolution trajectories are complicated due to sharp 
transitions (like shock waves) in evolution dynamics with smooth fitness landscapes, which exist 
even in case of popular quadratic fitness. We found exact analytical solutions for discontinuous dy- 
namics at the large genome length limit. We found the optimal mutation rates for the fixed fitness 
landscape. The single peak fitness landscape gives the fastest dynamics to send the vast majority 
of the population from the initial sequence to the neighborhood of the final sequence. 

PACS numbers: 87.23.Kg, 64.60.De 



I. INTRODUCTION 

The search of optimization in evolutionary processes 
is a rather popular idea in evolution research [l], 0, H, 
HI IE IE 0] ■ Here we should distinguish the optimization 
via mutation rate Q and via fitness landscapes [|[ . The 
first is rather relevant for the real biology. According to 
Darwinian view to evolution, the force driving evolution 
is the natural selection, not the creation of genetic vari- 
ants, because the mutations are random. Nevertheless 
there have been experimental results which suggest that 
mutation rates can vary, e.g. increasing during certain 
stresses [l| . This phenomenon has been called " adaptive 
mutation" , which means that the mutation rate is under 
the selective pressure @]. 

We will give the theory of optimal mutation rates in 
case of infinite populations. The realistic case, of cause, 
is connected with the finite population, and our results 
could be considered just as a first step in that direction. 

From the early days of Darwin- Wallace evolution the- 
ory [8] there has been a hope that there is some optimiza- 
tion of the fitness during the evolution process. Such pic- 
ture really was confirmed in the case of proteins In 
connection with [3] has been considered a mathematical 
problem. What has been examined was a direct par- 
allelism with the famous Brachistochrone problem sug- 
gested in 1696 by Johann Bernoulli. Given two mutants, 
A and B, separated by n mutational steps, what is the 
evolutionary trajectory which allows a homogeneous in- 
finite population of A to reach B in the shortest time? 
In [H has been considered an approximate solution of 
the mathematical problem in the case of finite popula- 
tion, considering an optimization via fitness landscape. 
To find an optimized evolution trajectory, one should 
have, first of all, exact solutions of evolution dynamics. 
Such solutions have been found only recently in the case 
of a single-peak fitness landscape for the Crow-Kimura 
li, [ll| in [H and for the Eigen model [H, El in 
12]. In [16] has been solved the evolution dynamics (the 
manner of change of the mean number of mutations in 
population) for the general symmetric fitness landscape 



case, derived using the Hamilton-Jacobi equation (HJE) 
method [TtI EH- The optimization problem is highly 
complicated due to discontinuous effects in the dynamics 
of a mean number of mutations, as has been found in 
[l6| (see also pjl H(|). The phenomenon exists even in 
smooth fitness landscape. In Fig. 2 there is an example 
of such discontinuity. The overlap x* (mean number of 
mutations is defined as 1 — 2x* /N, N being the genome 
length) is first a smooth function of time, then its value 
jumps from the low branch of the S like loop to the upper 
one, and again is a smooth function of time. As evolu- 
tion equations are exactly mapped into HJE equations, 
one can introduce Hamiltonians and corresponding po- 
tentials. In [l|| has been suggested a receipt to identify 
a class of discontinuities: when the evolution potential 
has two local maximums. It has been observed also that 
the discontinuous dynamics could occur even for the case 
of a single maximum in evolution potential, when the fit- 
ness function is steep enough. The phenomenon is rather 
involved and in [lj| neither the "enough steepness" could 
be identified nor the position of sharp transitions for any 
case of discontinuities. We will give an analytical theory 
for some cases of discontinuous dynamics, and exact re- 
sults for the optimization via mutation rate. We found 
discontinuities even in the quadratic fitness landscapes, 
missed in 1161. 



II. THE MUTATION OPTIMIZATION 



A. Crow-Kimura model with asymmetric 
mutations 



Consider the case of model with different forward and 
back mutation rates as in [3j . For symmetric-fitness land- 
scapes this model [ll[ assumes that the relative proba- 
bilities pi, I = 0,1,..., N (N being the genome length), 
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where we have the following expression for F: 



dpi 
dt 



(1) 



Pi [Nf (mi) - {if -ib)l + lbN] 
+ 7h (N — I + 1) + 7/ (/ + l)pi+i, 

where m; = 1 — 21/N, and p; are relative probabilities at 
the Hamming distance 1 (1 mutations); f(x) is a fitness 
function; and 7/, 7& are the mutation rates. In Eq.(l), for 
/ = and I = A~ we omit p-i and pat+i. Probability of 
having a molecule at Hamming distance I from the master 
is Pi/J2kP k - A s in Ref.[l7|, at a discrete a; = 1 — 21/N 
we use the ansatz: pi(t) = p(x,t) ~ exp[A^u(x, i)]. Eq. 
(1) can be written as Hamilton- Jacobi equation for u = 
\np(x,t)/N E3: 

-JT(x,p) = /(x) - ((1 - x) 7/ + 76(1 + x))/2 + 
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7^e^+7/^e- 2 ^ (2) 

where m' = du/dx, the domain of x is — 1 < x < 1, 
and the initial distribution is u(x,0) = uq(x). Minimiz- 
ing —H(x,p) via p, we get the expression of evolution 
potential, 

U(x) = f(x) + ^/WffV 1-x 2 - - (3) 

The evolution behavior is defined by the evolution po- 
tential (l6| . 

We solved Eq.(2) for 7/ = lb_= 7 case in [3 by a 
method of characteristics [2l|, [22j]. For the character- 
istics line x(t) we have a Hamilton equation dx/dt = 
dH(x,p)/dp. In our case Eq.(2) gives: 



i = ±2 A /A: 



7/7f)(l 



fc = g + 7/ 



1-x 



lb- 



fix) 



(4) 



where q = du(x,t)/dt is constant along the characteris- 
tics, like the energy of the particle in classical mechanics. 
At every point we have two characteristics, moving to the 
right and left. 

We consider the dynamics of the population in the 
Crow-Kimura model, originally having fixed overlap xq 
with the reference (master) sequence. Let us look at the 
manner of change in the mean overlap of the population 
x*(t*) = J2 3 Pji 1 - 2di/N) at the moment of time t*. 
Pj is the fraction of the type j in the population, dj 
is the number of mutations in the j-th type (compared 
with the master sequence), and such mutant has a fitness 
Nf(l — 2di/N). As time progresses the overlap distribu- 
tion spreads out and so we focus on the time evolution of 
overlap x* that yields the maximum of this distribution. 

Following to derivations of [16( , we derived for the large 
initial xq 
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(5) 



F(i,x*,0 = 



f (0+7/^+76^-/(0 



-7/76 (1 - £ 2 ) (6) 
For the small xo there is another expression: 

xi 

t* = \ldt [F(i,x*,or 1/2 



+ \l d z [F{i,x*,or 1/2 



and xi is the solution of 



F(7,^* ! x 1 ) = 0. 



(7) 



(8) 



In [T(| has been considered the symmetric mutation 
scheme 7/ = ib = 7 with F s instead of F: 

F s (l, x*,0 = [/ (x*) + 7 - I (Of - 7 2 (1 - e) (9) 



For the quadratic fitness function 



(10) 



Eqs. (5), (7) have real solutions provided that x* < 
1 — l/c. This upper bound determines the asymptotic 
value of the overlap with the reference sequence. Of 
course, in the case 7/c > 1 the selective phase is lost 
and the dynamics drifts in the sequence space so that 
the asymptotic regime is characterized by a zero overlap 
with the reference sequence. To decide which equation 
to use we need to calculate 
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th = \j d£ [F s (i,x h ,OY 
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(11) 



where Xh is a root of F s (7, x/j,Xo) = 0. This equation 
has a solution provided that / (xh) < / (xo), which in the 
case of monotonically increasing fitness implies Xh < Xo- 
Thus for a given xo and t* we calculate Xh and then th- 
If t* < th we use Eq. otherwise we use Eq. (8), to 
obtain x* — x* (t*). 



B. Discontinuous dynamics in case of quadratic 
fitness function 

In (l6l | there have been derived analytical formulas 
Eqs. (5), (7) with F = F s . The failed to describe 
the discontinuities of x* (t) analytically. The mean fit- 
ness is defined as a minimum of U (x) . When this func- 
tion has two maxima at 1 > x > 0, there is a discon- 
tinuity in the dynamics, [l||. The point is that there 
can be singularities in the dynamics, even for the fitness 
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FIG. 1: The dynamics of x*(t*) (most likely value of the 
overlap with the reference sequence as function of time t*) 
by Eq. (7) for the symmetric case 7/ = 7b = 7 and (top to 
bottom at ct* = 2) j/c = 0.05, 0.1, . . . , 0.7, 0.75. The initial 
population has overlap xo — 0.01 with the reference sequence. 
For t* — > oo we find x* — 1 — y/c. 




FIG. 2: Numerical solution of the ODE system (7) for xo — 
0.01, 7/c = 0.05 and (dashed vertical lines from left to right) 
N = 2000, 4000, . . . , 12000. For TV -> oo the jump in x* takes 
place at ct* = ct*? = 1.939 and has size Ax* — 0.755. 



with a single maximum at x > 0, when the fitness is too 
steep. We performed a numerics for symmetric mutations 
(7/ — 7& — 7) to clarify the character of discontinuous 
dynamics, see Fig.l-Fig.3. In the selective phase c > 7, 
the potential U(x) has a single maximum at 1 > x > 0. 
Nevertheless, sometimes the function x*(t*) has jumps. 

Figure [1] illustrates the time evolution of x* for xo = 
0.01. For not too small 7/c the x*(t*) is a monotonic 
function, and the direct numerics of the system of equa- 




FIG. 3: The critical line 7 c /c vs. £0 at which Ax* = 0. The 
critical c is defined from the system of equations dx* /dt* = 
0, d 2 x*/d 2 t* = 0. Below this curve the most probable overlap 
x* undergoes a discontinuous transition at t* — t* d (see Fig. 
0. 



FIG. 4: The relaxation from the original flat distribution with 
the fitness function /(m) = 4 * exp(8(m — 1)), 7 = 0.1, N = 
1000, 5000, 10000. The jump is at the point dx*/dt* at TV = 



tion for Crow-Kimura model supports well the theoretical 
formulas for x*(t*). 

For small values of 7/c the S-shaped curves indicate 
the existence of a discontinuity in the position of the 
maximum of the overlap probability distribution. This 
threshold phenomenon was overlooked in a previous anal- 
ysis of this problem which considered a single parameter 
setting, c = 2 and 7=1 ■ 

The unusual time dependence of x* exhibited in Fig. [T] 
is quite counter-intuitive since it implies that for, say, 
7/c = 0.05 there is an entire range of overlap values 
which are never reached by the evolutionary dynamics. 
To check that finding and to gather information on the 
stability of the solutions in the multi-solution regime, we 
present in Fig. 2 the results of the numerical solution 
of the ODE system (1) for different values of sequence 
lengths. These results not only confirm the theoreti- 
cal predictions but complement them by showing that 
the solution corresponding to the lower branch of the S- 
shape is the stable one. This information allows us to 
obtain the value t* = t* d at which the discontinuity takes 
place as well as the size of the discontinuity Ax* . This 
can be done by locating the lower value of x* for which 
dt*/dx* = in Eq. (7). 

Our conclusion, deduced from the analysis of Fig. 2 
that the jump in the dynamics occurs at the point where 
dx* /dt* = 0, is a rather general one. We checked that 
it is valid in other cases with discontinuous dynamics as 
well, see Fig. 4. 



C. Optimal mutation rates in case of fixed overlap 
value in original population 

Here we explore another important result exhibited in 
Fig-HJ namely, that there is an optimal value of the scaled 
mutation rate 7/c that minimizes the evolutionary time 
to go from xq to x* > xq. To assess this point in more 
detail, we note first the obvious fact that this evolution- 
ary trajectory is possible only for 7/c < 1 — x*. With 
this fact in mind, we can see from Fig. [T] that to reach 
the end point, say, x* = 0.2 it is a bad strategy to choose 



y/c 

FIG. 5: Time period t* needed for the maximum of the 
overlap distribution to reach the values (right to left) x* — 
0.1, 0.2, . . . , 0.8 as function of y/c. The initial population has 
overlap xo = 0.01 with the reference sequence. The dynamics 
can reach x* provided that y/c < 1 — x* . 



ct* 

FIG. 6: Location x* of the maximum of the overlap distribu- 
tion as function of time t* for the symmetric case 7/ = yt — y 
and (top to bottom) y/c = 0, 0.1, 0.2, . . . , 0.8. The initial 
population is uniformly distributed in sequence space. For 
t* — > 00 we find x* — 1 — y/c. 



either small or large values of y/c. In fact, there is an 
optimal value of the mutation rate, which for the param- 
eter setting of this example (x = 0.01 and x* = 0.2) 
is y pt/c = 0.3632. This interesting analytical finding 
substantiates the empirical strategy of fine tuning the 
mutation rate in Genetic Algorithms f23j. 

Figure [5] neatly illustrates the existence of an optimal 
mutation rate for the fixed initial condition xq = 0.01. To 
draw one of the curves in this figure we keep the end point 
x* fixed and measure the evolution time as a function of 
the scaled mutation rate. The existence of an optimal 
mutation rate that corresponds to the fastest evolution- 
ary trajectory (i.e., minimum t*) for the particular fitness 
choice, Eq. ([T0]1 . is patent from this figure. 

To find the exact location of the minima exhibited in 
Fig. [5] we put the condition = 0, and get 

1 ? d£ dF s (y,x*,0 1 ? d£ dF s (y,x*,0 _ 



and 



4 J p 3 / 2 dy 



4 J p 3 / 2 dy 



_L dy 

e l/2 dF s (-y,x',() 
dx\ 

F s (y,x*,xx)=^2) 

Here e must be set to a small (but not too small) value, 
typically e = 10~ 6 . We also simply opted for the direct 
numerical derivation of the curves shown in Fig. [5] What 
is surprising is that the optimal mutation rate y op t grows 
very steeply as 1* departs from xq and quickly reaches 
a maximum value. Looking carefully the Eq. (7), we 
see an important issue. The optimization depends on 
the behavior of the fitness function outside the interval 
[x ,x*]. 



D. Originally flat distribution 

For the initially flat distribution we have [l6| 

1+x 1+x 1-x 1-x 
u[0,x) = — In— — In— — , (13) 



xi 

t* = \j [F s (y,x*,0Y 



1/2 



(14) 



where F s and x\ are defined by Eqs. (JU) and (J8J) , respec- 
tively. As pointed out in Ref. [101, the initial overlap 
distribution has a peak at x = 0, which yields the maxi- 
mum of the overlap distribution for t < to where [l6j 



ct 



(1 - y/c) 



1/2 



[7/c(l-7/c)] 



1/2 



(15) 



Note that cio — * 1 for y/c — > 0, and cto w 
y/2(l — y/c) ^ 2 for y/c —> 1. We turn now to the 
case where the initial population is uniformly distributed 
among the 2 N configurations. 

Figure [S] shows the time evolution of x* for the flat ini- 
tial distribution. The results are in stark contrast with 
those of the peaked initial distribution (see Fig. [T|) : the 
odd S-shaped curves that produced the interesting dy- 
namic behavior discussed before are absent in this case. 
In addition, the curves for different values of y/c never 
cross which indicates that the fastest trajectory to reach 
any point x* is given by a vanishingly small mutation 
rate. 

Let us calculate the optimal period to have a peak of 
population with the overlap x* at the moment of time t* . 
We should find the x*(t*) looking the maximum of the 

1 + x * 1 1 + x * 1 ~ x \ 1_:r * , +*a *\ rid\ 
7T~ ln ^7— +* f( x ) ( 16 ) 



III. 



FITNESS OPTIMIZATION 



Although the selection of a fitness function that mini- 
mizes the evolution time between any two points xq and 
x* , which correspond to the maximum of the overlap dis- 
tribution in two distinct times, is not as biologically sig- 
nificant as the selection of the optimal mutation rate, it 
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has a considerable aesthetical appeal as the problem is 
somewhat akin to the Brachistochrone problem of physics 
Q . In [|| has been assumed that the fastest evolution dy- 
namics between two sequences is given by a single peak 
fitness. The point is that one should accurately formulate 
the optimization task. The first possibility- we look the 
arrival of some fraction of population to the master peak. 
The second version: we look the arrival of a vast major- 
ity of population to the small (the Hamming distance is 
miserable compared with N) neighborhood of the master 
sequence. The situation is highly non-trivial. If we took 
the first version with some small fraction, then the lin- 
ear fitness could give better results than the single-peak 
fitness, see the Fig. 7. 

If we take the second version of optimization, then the 
single-peak fitness looks like as the fastest one. For the 
considered case (from sequence to sequence) we can just 
give the expression of the minimal time following to the 
results by [12 ] . 

For the symmetric mutation case the fastest relaxation 
gives the single-peak fitness landscape (r = J for the 
peak sequence and = for the other sequences). In [l2| 
has been found the relaxation period to send the popula- 
tion from the given sequence (at the Hamming distance 
N(l — m)/2 from the peak sequence) to the peak one. 
To find the minimal time t we just add the optimization 
condition via the choice of 7 to the solution of [151 ]: 



■ tanh(7^i) 



t = 



1 



(x,*i) - Jh 



J 



<j>{x,t) 



■ lncosh(7i) 



2 tanh(7ii 
1 - x 



) 1 



"7 

^=0 



In sinh(7t)] 
07 



(17) 



We have done some numerics, see Fig. 7, supporting the 
choice of single-peak fitness as an optimal fitness for the 
fastest relaxation, and t by Eq. (17) as a minimal time 
period. 

Consider now the fitness optimization problem in case 
of overlap distributions (to send the population from the 
original overlap with m = xq to the eventual one with 
to = x*) and symmetric fitness landscape. We are look- 
ing the optimization problem for the special fitness with 

f(m) = 0,x < x* 

f{x*)=J (18) 

Eq.(8) gives x\ = x*, then Eq. (7) is simplified: the first 
term disappears. For the fitness by Eq.(18) we have 



t* 



^{j+i) 2 -i 2 {i-e) 



(19) 



It is easy to check that the minimal time is given by the 
fitness of Eq.(18). As 




FIG. 7: Location These are the results of the simulations 
for N = 20, symmetric mutation rate 7 = 1. Fitness f(x) = 
2Nx a for a = 1,2,4 (show in the figure) SP f{x) = for 
x < 1 and /(l) = 2N. 




FIG. 8: Location x* of the maximum of the overlap distri- 
bution as function of time t* for directed mutation case with 
x = 0, (left to right) 7//c = 0.1,0.05 and 0.01. The numeri- 
cal solution of the system (1) is given by the dashed vertical 
lines for (left to right) N = 1000, 5000 and 10000. 



the time given by Eq.(19) is less than the time given by 
Eq.(7) for any f (m) > 0. 



IV. 



DIRECTED MUTATION CASE 



Consider the case of asymmetric mutations [3(. We 
have original distribution at some xo, and our goal is to 
send the population to the overlap x* . Now there is a 
single characteristic, therefore, contrary to the symmet- 
ric mutation case, all the properties are defined via the 
behavior of the fitness function in the considered interval 
[xq,x*]. Eqs.(4),(5) give the following equation 



t* = 



1 



dx 



f(x*)+^-f(x) 



(21) 



vV + 7) 2 - 7 2 (i - e) >v(j+i- f(0) 2 - 7 2 (i - em 



We see that the optimization via mutation is trivial: rais- 
ing the mutation rate we can send the population to 
the point x* immediately. The optimization via fitness 
is also trivial: the fastest trajectory is via the fitness 
/(to) = 0, to < to* and /(l) = J . 

We have done a numerics for quadratic fitness case, see 
Fig. 8. We see that the results again support the conjec- 
ture that the jumps are at the point with dx* jdt* = 0. 
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V. DISCUSSION 

We considered the problem of optimization in evolution 
in case of infinite population, symmetric fitness landscape 
and large genome length and found exact solutions. We 
found that the optimization (optimal control, see (23 |) 
is a highly non-trivial problem, as sharp, discontinuous 
transitions are typical for the evolution dynamics even 
with smooth fitness landscapes. We investigated these 
discontinuities and gave an analytical description of such 
sharp transitions for symmetric smooth landscape. We 
could succeed doing numerics for a larger values of N than 
those in (l6| . We found dynamical discontinuities even 
in case of directed mutations. The sharp transitions in 
evolution are important regarding the punctual evolution 
phenomenon (see [25[ and the review |26l|). 

The optimization via mutation rate is most intriguing 
from the point of view of adaptive mutations. We cal- 
culated the minimal time to send the population from 
original sequence with small overlap (with the master 
sequence) and low fitness to the some final one (with a 
higher fitness). The solution of the optimization problem 
is nontrivial, and there is some optimal mutation rate. It 
is interesting that the optimal rate of mutation to send 
the population from the overlap xo to x* is defined with 
the behavior of the fitness outside the interval [io,i*]. 
The numerics confirm our analytical results. On the con- 
trary, when we need to send the population from the 
high fitness configurations with the fixed original overlap 
to the final one with lower overlap, the mutation's opti- 
mization is a trivial task: just increase the mutation rate. 
Similar is the situation in case of directed mutation: one 
can send the population in a fastest way just increasing 
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